This specification relates to scheduling resource crawls.
Fixed or limited computing system resources (e.g., central processing unit time, memory space, and storage space) make it challenging for crawling to keep pace with the ever-expanding web. In particular, balancing the allocation of crawl resources among web pages can lead to crawling some web pages more often than necessary (e.g., when the content has not changed) and crawling other web pages less often than necessary, which contributes to a miss rate for content updates.